Exploring Kenya's Contributions: A Gender-Driven Analysis of Uniformed Personnel in UN Peace Missions

Task:¶

Conduct an analysis of the contributions made by uniformed personnel from "Kenya" to UN Missions, with a focus on gender breakdown. Download the historical contributions dataset from the Peace & Security Data Hub, and perform a detailed examination to uncover trends, including troop numbers, mission types, duration of engagements, and temporal changes. Output the results by creating comprehensive data visualizations using Excel.

Content:¶

Data Engineering: Kenya’s Impact in UN Peace Missions¶

  1. Import the dataframe:

    • Use Pandas to load the data into Python.
  2. Presentation and Description:

    • Explore the dataset with commands like dataframe.head(), dataframe.shape, and dataframe.columns.
  3. Handling Duplicates and Null Values:

    • Check and drop duplicate values using dataframe.duplicated() and dataframe.drop_duplicates().
    • Manage null values with dataframe.isnull().sum() and replace NaN values in specific columns.
  4. Managing Dates and Data Types:

    • Create a 'year' column by extracting the first 4 characters from 'last_reporting_date'.
    • Inspect and adjust data types using dataframe.dtypes and convert columns with astype() for better analysis.
  5. Derived Variables:

    • Introduce new variables like 'total_personnel', 'female_percent', and 'male_percent' to enhance the dataset.
  6. Sampling for Focused Analysis:

    • Create a subset specific to Kenya, named dataframe_kenya, for a targeted analysis on contributions and personnel.
    • Display the first 20 entries of dataframe_kenya to preview the data.
  7. Data Encoding for Enhanced Analysis:

    • Data encoding is a crucial step in the analytical process, involving the transformation of categorical variables like mission_acronym and personnel_type into a numerical format.

Data Analyst: Kenya’s Impact in UN Peace Missions¶

  1. Active Engagement:

    • Kenya plays a crucial role in UN missions, contributing significantly to peace and security in African regions. Notable engagements include Sudan, South Sudan, Darfur, and the Democratic Republic of the Congo.
  2. Yearly Impact:

    • Kenya's yearly totals, summed over all reporting records, ranged from roughly 9,600 to 12,600 between 2010 and 2016. In 2017 the total dropped to 1,352, then recovered unevenly to 4,805 in 2023.
  3. Gender Parity:

    • Despite commendable gender parity in certain roles compared to the global standard, discrepancies with the UN Gender Parity Dashboard suggest potential underrepresentation of women in unaccounted roles within peacekeeping operations.
  4. Troop Contribution:

    • Kenya stands out in troop contributions, particularly in deploying Experts on Mission, demonstrating a strong commitment to global peacekeeping. Positive trends in gender parity within roles like "Individual Police" and "Staff Officer" are observed.
  5. Conclusion:

    • Kenya's concentrated role in specific regions underscores its substantial impact in UN peace missions. The study calls for a deeper examination of gender parity discrepancies and recognizes Kenya's vital contribution to global peacekeeping efforts.
  6. Annex:

    • I have developed a Power BI application to enhance accessibility and visualization. The Power BI application provides a dynamic and interactive exploration of the analysis, offering a user-friendly interface. Access the project here.

    • The map and graphics were created using Python, leveraging Pandas and Plotly libraries for visualization. Access the notebook on GitHub here.

Introduction:

  • countries.csv | Dataset Publishing Language | Google for Developers

Active Engagement:

  • Peace & Security Data Hub (un.org)

Gender Parity:

  • Gender Parity Strategy | United to Reform

Other Information:

  • Population, Surface Area, and Density Data
  • GDP and GDP Per Capita Data

Data Engineering: Kenya’s Impact in UN Peace Missions¶

Import of the libraries and the dataframe¶

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
url = "https://api.psdata.un.org/public/data/DPO-UCHISTORICAL/csv"
dataframe = pd.read_csv(url)

Preview the first rows of the data¶

In [3]:
dataframe.head(20)
Out[3]:
id contribution_id last_reporting_date isocode3 mission_acronym personnel_type female_personnel male_personnel m49_code contributing_country
0 1 434570 2010-01-31 TUN MONUC Experts on Mission 0.0 31.0 788 Tunisia
1 2 434569 2010-01-31 TUN MINURCAT Troops 0.0 3.0 788 Tunisia
2 3 434568 2010-01-31 TUN MINURCAT Experts on Mission 0.0 4.0 788 Tunisia
3 4 434567 2010-01-31 TGO UNOCI Troops 0.0 314.0 768 Togo
4 5 434566 2010-01-31 TGO UNOCI Experts on Mission 0.0 7.0 768 Togo
5 6 434565 2010-01-31 TGO UNOCI Individual Police 0.0 16.0 768 Togo
6 7 434564 2010-01-31 TGO UNMIL Troops 0.0 1.0 768 Togo
7 8 434563 2010-01-31 TGO UNMIL Experts on Mission 0.0 2.0 768 Togo
8 9 434562 2010-01-31 TGO UNAMID Experts on Mission 0.0 8.0 768 Togo
9 10 434561 2010-01-31 TGO UNAMID Individual Police 0.0 1.0 768 Togo
10 11 434560 2010-01-31 TGO MONUC Individual Police 0.0 18.0 768 Togo
11 12 434559 2010-01-31 TGO MINUSTAH Individual Police 0.0 4.0 768 Togo
12 13 434558 2010-01-31 TGO MINURCAT Troops 0.0 457.0 768 Togo
13 14 434557 2010-01-31 TGO MINURCAT Individual Police 0.0 8.0 768 Togo
14 15 434556 2010-01-31 THA UNMIT Individual Police 5.0 13.0 764 Thailand
15 16 434555 2010-01-31 THA UNMIS Experts on Mission 1.0 9.0 764 Thailand
16 17 434554 2010-01-31 THA UNAMID Troops 2.0 8.0 764 Thailand
17 18 434553 2010-01-31 THA UNAMID Experts on Mission 0.0 9.0 764 Thailand
18 19 434552 2010-01-31 TZA UNOCI Troops 3.0 0.0 834 United Republic of Tanzania
19 20 434551 2010-01-31 TZA UNMIS Experts on Mission 0.0 12.0 834 United Republic of Tanzania

Presentation and description of the dataframe¶

Shape and columns :¶

In [4]:
print(dataframe.shape)
print(dataframe.columns)
(146967, 10)
Index(['id', 'contribution_id', 'last_reporting_date', 'isocode3',
       'mission_acronym', 'personnel_type', 'female_personnel',
       'male_personnel', 'm49_code', 'contributing_country'],
      dtype='object')

The DataFrame has a shape of (146967, 10), meaning it contains 146,967 rows and 10 columns.

The columns in the DataFrame are as follows:

  1. id: An identifier for each row.
  2. contribution_id: An identifier for contributions.
  3. last_reporting_date: Date of the last reporting.
  4. isocode3: A three-letter country code.
  5. mission_acronym: Acronym for the mission.
  6. personnel_type: Type of personnel involved.
  7. female_personnel: Count of female personnel.
  8. male_personnel: Count of male personnel.
  9. m49_code: A numerical country code.
  10. contributing_country: The country contributing personnel.

Let's check that it contains no duplicate contribution_id values¶

In [5]:
duplicates = dataframe[dataframe.duplicated(subset='contribution_id',keep=False)]

if not duplicates.empty : 
    print("There are duplicates based on the contribution_id column:")
    print(duplicates)
else :
    print("No duplicates found based on contribution_id column.")
No duplicates found based on contribution_id column.

Now let's check for the existence of null values¶

In [6]:
dataframe.isnull().sum()
Out[6]:
id                       0
contribution_id          0
last_reporting_date      0
isocode3                 0
mission_acronym          0
personnel_type           0
female_personnel        14
male_personnel          22
m49_code                 0
contributing_country     0
dtype: int64

We have identified two columns with missing values: female_personnel has 14 occurrences, and male_personnel has 22 occurrences. To address this, we are going to replace these missing values with the numeric value "0".

In [7]:
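# Assign the result back instead of calling fillna(..., inplace=True) on a column selection,
# which may act on a temporary copy in recent pandas versions
dataframe["female_personnel"] = dataframe["female_personnel"].fillna(0)
dataframe["male_personnel"] = dataframe["male_personnel"].fillna(0)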

dataframe.isnull().sum()
Out[7]:
id                      0
contribution_id         0
last_reporting_date     0
isocode3                0
mission_acronym         0
personnel_type          0
female_personnel        0
male_personnel          0
m49_code                0
contributing_country    0
dtype: int64

Managing Dates and Data Types:¶

From the last_reporting_date column, we will add a year column to our dataframe. The dates are stored as YYYY-MM-DD strings, so the first 4 characters of each value give the year.

In [8]:
dataframe["year"]=dataframe["last_reporting_date"].str[:4]
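String slicing works here because every date follows the YYYY-MM-DD layout. As an alternative, the sketch below parses the column with pd.to_datetime, which fails loudly on a malformed date and also makes the month available. The toy frame and the 'month' column are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Toy frame mimicking the 'last_reporting_date' column of the dataset
sample = pd.DataFrame(
    {"last_reporting_date": ["2010-01-31", "2017-06-30", "2023-12-31"]}
)

# Parse the strings as real dates; a value that does not match the format raises an error
parsed = pd.to_datetime(sample["last_reporting_date"], format="%Y-%m-%d")

# Extract calendar components from the parsed dates
sample["year"] = parsed.dt.year
sample["month"] = parsed.dt.month
print(sample)
```

With this approach the later astype(int) conversion of 'year' becomes unnecessary, since dt.year already returns integers.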

Now change the data types to analyse parity¶

In [9]:
dataframe['id'] = dataframe['id'].astype(int)
dataframe['contribution_id'] = dataframe['contribution_id'].astype(int)
dataframe['last_reporting_date'] = dataframe['last_reporting_date'].astype(str)
dataframe['mission_acronym'] = dataframe['mission_acronym'].astype(str)
dataframe['personnel_type'] = dataframe['personnel_type'].astype(str)
dataframe['female_personnel'] = dataframe['female_personnel'].astype(int)
dataframe['male_personnel'] = dataframe['male_personnel'].astype(int)
dataframe['m49_code'] = dataframe['m49_code'].astype(int)
dataframe['contributing_country'] = dataframe['contributing_country'].astype(str)
dataframe['year'] = dataframe['year'].astype(int)
print(dataframe.dtypes)
id                       int32
contribution_id          int32
last_reporting_date     object
isocode3                object
mission_acronym         object
personnel_type          object
female_personnel         int32
male_personnel           int32
m49_code                 int32
contributing_country    object
year                     int32
dtype: object

Creation of Derived Variables:¶

  1. total_personnel:

    • Combine 'female_personnel' and 'male_personnel' columns to calculate the total personnel.
  2. female_percent:

    • Compute the percentage of female personnel relative to the total personnel, rounded to two decimal places.
  3. male_percent:

    • Calculate the percentage of male personnel relative to the total personnel, rounded to two decimal places.
In [10]:
dataframe["total_personnel"] = dataframe["female_personnel"] + dataframe["male_personnel"]
dataframe["female_percent"] = round(dataframe["female_personnel"] * 100 / dataframe["total_personnel"],2)
dataframe["male_percent"] = round(dataframe["male_personnel"] * 100 / dataframe["total_personnel"],2)

These derived variables provide additional insights into the gender distribution within the personnel, offering a more comprehensive view of the data.
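One caveat: a record with zero women and zero men has a total of 0, so its percentages are undefined. The later describe() output for Kenya reflects this, showing 1,932 percentage values for 1,939 rows. The sketch below, on toy data, makes that case explicit by replacing a zero total with NaN before dividing; the name safe_total is an assumption introduced for illustration.

```python
import numpy as np
import pandas as pd

# Toy rows: two taken from the Kenya preview, plus one record with no personnel at all
toy = pd.DataFrame({"female_personnel": [44, 1, 0], "male_personnel": [681, 3, 0]})
toy["total_personnel"] = toy["female_personnel"] + toy["male_personnel"]

# A zero total becomes NaN so the undefined percentage is stored as NaN on purpose
safe_total = toy["total_personnel"].replace(0, np.nan)
toy["female_percent"] = (toy["female_personnel"] * 100 / safe_total).round(2)
toy["male_percent"] = (toy["male_personnel"] * 100 / safe_total).round(2)
print(toy)
```

Leaving these rows as NaN, rather than filling them with 0, keeps them out of means and correlations computed on the percentage columns.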

Sampling for Focused Analysis:¶

  1. Filtering Data:

    • Use dataframe['contributing_country'] == "Kenya" to create a new dataframe, dataframe_kenya, containing only entries from Kenya.
  2. Analysis Focus:

    • This subset allows focused analysis on Kenya's contributions, personnel, and other relevant factors.
  3. Displaying Sample:

    • Show the first 20 entries of dataframe_kenya to provide a preview for analysis.

This targeted approach enables a more efficient and specific analysis of Kenya's impact and characteristics in the dataset.

In [11]:
dataframe_kenya =dataframe[(dataframe['contributing_country'] == "Kenya")]
dataframe_kenya.head(20)
Out[11]:
id contribution_id last_reporting_date isocode3 mission_acronym personnel_type female_personnel male_personnel m49_code contributing_country year total_personnel female_percent male_percent
824 825 434193 2010-01-31 KEN UNMIS Troops 44 681 404 Kenya 2010 725 6.07 93.93
825 826 434192 2010-01-31 KEN UNMIS Experts on Mission 1 3 404 Kenya 2010 4 25.00 75.00
826 827 434191 2010-01-31 KEN UNMIS Individual Police 1 18 404 Kenya 2010 19 5.26 94.74
827 828 434190 2010-01-31 KEN UNMIL Individual Police 5 16 404 Kenya 2010 21 23.81 76.19
828 829 434189 2010-01-31 KEN UNAMID Troops 6 74 404 Kenya 2010 80 7.50 92.50
829 830 434188 2010-01-31 KEN UNAMID Experts on Mission 1 5 404 Kenya 2010 6 16.67 83.33
830 831 434187 2010-01-31 KEN MONUC Experts on Mission 1 23 404 Kenya 2010 24 4.17 95.83
831 832 434186 2010-01-31 KEN MINURCAT Troops 0 4 404 Kenya 2010 4 0.00 100.00
1575 1576 435091 2010-02-28 KEN UNMIS Troops 44 681 404 Kenya 2010 725 6.07 93.93
1576 1577 435090 2010-02-28 KEN UNMIS Experts on Mission 1 3 404 Kenya 2010 4 25.00 75.00
1577 1578 435089 2010-02-28 KEN UNMIS Individual Police 1 18 404 Kenya 2010 19 5.26 94.74
1578 1579 435088 2010-02-28 KEN UNMIL Individual Police 5 16 404 Kenya 2010 21 23.81 76.19
1579 1580 435087 2010-02-28 KEN UNAMID Troops 5 76 404 Kenya 2010 81 6.17 93.83
1580 1581 435086 2010-02-28 KEN UNAMID Experts on Mission 1 5 404 Kenya 2010 6 16.67 83.33
1581 1582 435085 2010-02-28 KEN MONUC Experts on Mission 1 23 404 Kenya 2010 24 4.17 95.83
1582 1583 435084 2010-02-28 KEN MINURCAT Troops 0 4 404 Kenya 2010 4 0.00 100.00
2464 2465 435984 2010-03-31 KEN UNMIS Troops 44 681 404 Kenya 2010 725 6.07 93.93
2465 2466 435983 2010-03-31 KEN UNMIS Experts on Mission 1 3 404 Kenya 2010 4 25.00 75.00
2466 2467 435982 2010-03-31 KEN UNMIS Individual Police 1 18 404 Kenya 2010 19 5.26 94.74
2467 2468 435981 2010-03-31 KEN UNMIL Individual Police 5 16 404 Kenya 2010 21 23.81 76.19

Data Encoding for Enhanced Analysis¶

The process of encoding data involves transforming categorical variables, such as 'mission_acronym' and 'personnel_type,' into a numerical format that can be utilized in statistical analyses. In this case, we employ one-hot encoding to convert these categorical columns into binary indicators, facilitating the exploration of relationships and patterns in the subsequent correlation map. The resulting encoded DataFrame, with new binary columns representing each category, ensures a more comprehensive and effective analysis of the dataset's underlying structure and associations.

In [12]:
columns_to_encode = ['mission_acronym', 'personnel_type']
encoded_df = pd.get_dummies(dataframe_kenya, columns=columns_to_encode)
encoded_df[encoded_df.columns.difference(dataframe_kenya.columns)] = encoded_df[encoded_df.columns.difference(dataframe_kenya.columns)].astype(int)
encoded_df.columns
Out[12]:
Index(['id', 'contribution_id', 'last_reporting_date', 'isocode3',
       'female_personnel', 'male_personnel', 'm49_code',
       'contributing_country', 'year', 'total_personnel', 'female_percent',
       'male_percent', 'mission_acronym_MINURCAT', 'mission_acronym_MINURSO',
       'mission_acronym_MINUSCA', 'mission_acronym_MINUSMA',
       'mission_acronym_MONUC', 'mission_acronym_MONUSCO',
       'mission_acronym_UNAMID', 'mission_acronym_UNIFIL',
       'mission_acronym_UNISFA', 'mission_acronym_UNMHA',
       'mission_acronym_UNMIL', 'mission_acronym_UNMIS',
       'mission_acronym_UNMISS', 'mission_acronym_UNSMIS',
       'mission_acronym_UNSOM', 'mission_acronym_UNSOS',
       'personnel_type_Experts on Mission', 'personnel_type_Individual Police',
       'personnel_type_Staff Officer', 'personnel_type_Troops'],
      dtype='object')
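To make the effect of one-hot encoding concrete, here is a minimal sketch on a three-row toy frame. It passes dtype=int to pd.get_dummies so the indicator columns come out as integers directly, which is an alternative to the separate astype(int) step used above. The toy values are illustrative.

```python
import pandas as pd

# Three toy records with the two categorical columns encoded in the notebook
toy = pd.DataFrame({
    "mission_acronym": ["UNMIS", "UNAMID", "UNMIS"],
    "personnel_type": ["Troops", "Troops", "Individual Police"],
    "female_personnel": [44, 6, 1],
})

# Each category becomes its own 0/1 column named <original column>_<category>
encoded = pd.get_dummies(
    toy, columns=["mission_acronym", "personnel_type"], dtype=int
)
print(encoded)
```

Every row has exactly one 1 among the mission columns and one 1 among the personnel type columns, which is why indicator columns of the same group are negatively correlated with each other in the correlation map.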

Data Analyst: Kenya’s Impact in UN Peace Missions¶

Kenya Compared with All UN Contributing Countries¶

In [13]:
kenya_about_world = pd.DataFrame()
kenya_about_world['number_of_contributions'] = dataframe.groupby('contributing_country')['contributing_country'].count()
kenya_about_world['sum_female_personnel'] = dataframe.groupby('contributing_country')['female_personnel'].sum()
kenya_about_world['sum_male_personnel'] = dataframe.groupby('contributing_country')['male_personnel'].sum()
kenya_about_world['total_personnel'] =  kenya_about_world['sum_female_personnel']+kenya_about_world['sum_male_personnel']
kenya_about_world['percent_female_personnel'] = round(kenya_about_world['sum_female_personnel'] * 100 / kenya_about_world['total_personnel'],2)
kenya_about_world['percent_male_personnel'] = round(kenya_about_world['sum_male_personnel'] * 100 / kenya_about_world['total_personnel'],2)
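# Assign the result back rather than using inplace=True on a column selection
kenya_about_world["percent_female_personnel"] = kenya_about_world["percent_female_personnel"].fillna(0)
kenya_about_world["percent_male_personnel"] = kenya_about_world["percent_male_personnel"].fillna(0)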
In [14]:
kenya_about_world.iloc[70:91]
Out[14]:
number_of_contributions sum_female_personnel sum_male_personnel total_personnel percent_female_personnel percent_male_personnel
contributing_country
Ireland 1236 4020 64550 68570 5.86 94.14
Israel 3 0 42 42 0.00 100.00
Italy 1026 8632 178767 187399 4.61 95.39
Jamaica 191 448 673 1121 39.96 60.04
Japan 258 320 24226 24546 1.30 98.70
Jordan 3290 3036 278666 281702 1.08 98.92
Kazakhstan 295 75 4779 4854 1.55 98.45
Kenya 1939 13032 79910 92942 14.02 85.98
Kiribati 13 24 36 60 40.00 60.00
Kyrgyzstan 872 191 2515 2706 7.06 92.94
Latvia 117 21 260 281 7.47 92.53
Lesotho 89 0 140 140 0.00 100.00
Liberia 373 1264 11113 12377 10.21 89.79
Libya 2 1 4 5 20.00 80.00
Lithuania 376 286 2799 3085 9.27 90.73
Luxembourg 69 0 140 140 0.00 100.00
Madagascar 568 601 4230 4831 12.44 87.56
Malawi 1255 9195 111931 121126 7.59 92.41
Malaysia 1491 5970 139254 145224 4.11 95.89
Mali 869 771 10007 10778 7.15 92.85
Malta 61 18 546 564 3.19 96.81

It is crucial to compare Kenya's commitment to UN peace missions with its neighboring countries, as this comparative analysis provides valuable insights into regional dynamics and contributions to global peacekeeping efforts. Understanding the involvement of neighboring nations, such as Ethiopia, Somalia, South Sudan, Tanzania, and Uganda, allows for a comprehensive evaluation of the collective regional impact on UN missions.

In [15]:
sorted_df = kenya_about_world.sort_values(by="number_of_contributions",ascending=False)
sorted_df.head(3)
Out[15]:
number_of_contributions sum_female_personnel sum_male_personnel total_personnel percent_female_personnel percent_male_personnel
contributing_country
Nepal 4501 34739 781767 816506 4.25 95.75
Bangladesh 4020 40379 1204755 1245134 3.24 96.76
Ghana 3818 53534 392135 445669 12.01 87.99
In [16]:
sorted_df = kenya_about_world.sort_values(by="total_personnel",ascending=False)
sorted_df.head(3)
Out[16]:
number_of_contributions sum_female_personnel sum_male_personnel total_personnel percent_female_personnel percent_male_personnel
contributing_country
Bangladesh 4020 40379 1204755 1245134 3.24 96.76
India 2876 16714 1092456 1109170 1.51 98.49
Pakistan 2779 7578 1064238 1071816 0.71 99.29

Comparing Kenya's contributions to UN peace missions with top-contributing countries like Nepal, Bangladesh, and Ghana is crucial for benchmarking, understanding global dynamics, and optimizing resource allocation. This analysis helps identify best practices, assess the global impact, and strategically plan Kenya's future commitments in alignment with successful nations in UN peacekeeping efforts.

Comparison of Contributions and Personnel Contributions by Country¶

In [17]:
import plotly.graph_objects as go
import plotly.subplots as sp

# Total contributions
total_contribution = kenya_about_world['number_of_contributions'].sum()

# Kenya's contribution
kenya_contributions = kenya_about_world.loc["Kenya", "number_of_contributions"]

# Best contributors
nepal_contributions = kenya_about_world.loc["Nepal", "number_of_contributions"]
bangladesh_contribution = kenya_about_world.loc["Bangladesh", "number_of_contributions"]
ghana_contributions = kenya_about_world.loc["Ghana", "number_of_contributions"]

# Neighboring countries of Kenya
ethiopia_contributions = kenya_about_world.loc["Ethiopia", "number_of_contributions"]
somalia_contributions = 0
sudan_contributions = kenya_about_world.loc["Sudan", "number_of_contributions"]
tanzania_contributions = 0
uganda_contributions = kenya_about_world.loc["Uganda", "number_of_contributions"]

# Calculate contributions (a group total is the sum of its members)
best_contributions = nepal_contributions + bangladesh_contribution + ghana_contributions
neighboring_contributions = ethiopia_contributions + somalia_contributions + sudan_contributions + tanzania_contributions + uganda_contributions
other_countries = total_contribution - kenya_contributions - best_contributions - neighboring_contributions

# Data for the first pie chart
contributions = [kenya_contributions, nepal_contributions, bangladesh_contribution, ghana_contributions, ethiopia_contributions, sudan_contributions, uganda_contributions, other_countries]
# The value is read from the "Sudan" row above, so the slice is labelled "Sudan"
labels = ["Kenya", "Nepal", "Bangladesh", "Ghana", "Ethiopia", "Sudan", "Uganda", "Other"]

# Total personnel
total_personnel = kenya_about_world['total_personnel'].sum()

# Kenya's contribution
kenya_personnel = kenya_about_world.loc["Kenya", "total_personnel"]

# Best contributors
india_personnel = kenya_about_world.loc["India", "total_personnel"]
bangladesh_personnel = kenya_about_world.loc["Bangladesh", "total_personnel"]
pakistan_personnel = kenya_about_world.loc["Pakistan", "total_personnel"]

# Neighboring countries of Kenya
ethiopia_personnel = kenya_about_world.loc["Ethiopia", "total_personnel"]
somalia_personnel = 0
sudan_personnel = kenya_about_world.loc["Sudan", "total_personnel"]
tanzania_personnel = 0
uganda_personnel = kenya_about_world.loc["Uganda", "total_personnel"]

# A group total is the sum of its members
best_personnel = india_personnel + bangladesh_personnel + pakistan_personnel
neighboring_personnel = ethiopia_personnel + somalia_personnel + sudan_personnel + tanzania_personnel + uganda_personnel
other_personnel = total_personnel - kenya_personnel - best_personnel - neighboring_personnel
personnel = [kenya_personnel, india_personnel, bangladesh_personnel, pakistan_personnel, ethiopia_personnel, sudan_personnel, uganda_personnel, other_personnel]
# The top three by personnel differ from the top three by contributions,
# so the second pie needs its own labels in the same order as the values
personnel_labels = ["Kenya", "India", "Bangladesh", "Pakistan", "Ethiopia", "Sudan", "Uganda", "Other"]


fig = sp.make_subplots(1, 2, specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(labels=labels, values=contributions, name="Contributions by Country"),1, 1)
fig.add_trace(go.Pie(labels=personnel_labels, values=personnel, name="Personnel Contributions by Country"),1, 2)
fig.update_layout(title_text="Comparison of Contributions and Personnel Contributions by Country")
fig.show()
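The pie values rely on the slices adding up to the grand total: the "Other" slice must equal the total minus the sum of the highlighted countries. A minimal sketch of that bookkeeping follows, using a hypothetical helper named shares_with_other and toy totals. Countries absent from the data are given 0 instead of raising a KeyError.

```python
import pandas as pd

def shares_with_other(totals: pd.Series, highlighted: list) -> pd.Series:
    """Return the values of the highlighted labels plus an 'Other' remainder."""
    # reindex keeps the requested order and fills labels missing from the data with 0
    picked = totals.reindex(highlighted, fill_value=0)
    # Whatever is not highlighted goes into a single remainder slice
    other = totals.sum() - picked.sum()
    return pd.concat([picked, pd.Series({"Other": other})])

# Toy totals, not the real dataset values
toy_totals = pd.Series(
    {"Kenya": 90, "Nepal": 800, "Bangladesh": 1200, "Ghana": 450, "Malta": 60}
)
slices = shares_with_other(toy_totals, ["Kenya", "Nepal", "Somalia"])
print(slices)
```

Because labels and values come from the same Series, the index of the result can be passed as the pie labels and its values as the pie values, which avoids label and value lists drifting out of order.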

Comparison of Percent Female and Male Personnel in Kenya and Rest of the World¶

In [18]:
import plotly.express as px

kenya_data = kenya_about_world.loc["Kenya"]
rest_of_world_data = kenya_about_world.drop("Kenya")

sum_female = rest_of_world_data["sum_female_personnel"].sum()
sum_male = rest_of_world_data["sum_male_personnel"].sum()
sum_total = sum_female + sum_male
avg_percent_female = sum_female * 100 / sum_total
avg_percent_male = sum_male * 100 / sum_total

data = pd.DataFrame({
    "Country": ["Kenya", "Kenya", "Rest of the World", "Rest of the World"],
    "Gender": ["Female", "Male", "Female", "Male"],
    "Percentage": [kenya_data["percent_female_personnel"], kenya_data["percent_male_personnel"], avg_percent_female, avg_percent_male]
})

fig = px.bar(data, x="Country", y="Percentage", color="Gender",
             labels={"Percentage": "Percentage"},
             title="Comparison of Percent Female and Male Personnel in Kenya and Rest of the World")

fig.show()

Correlation Map¶

A correlation map is useful for this study because it visually represents the strength and direction of linear relationships between variables, making patterns and dependencies in the data easier to spot. It helps surface trends in the factors associated with contributions by uniformed personnel from Kenya to UN Missions, which can then be examined in more detail.

In [19]:
import seaborn as sns

# include='number' selects every numeric dtype; astype(int) yields int32 on some platforms and int64 on others
numeric_columns = encoded_df.select_dtypes(include='number').columns.difference(['m49_code','contribution_id','id'])
correlation_matrix = encoded_df[numeric_columns].corr()

plt.figure(figsize=(15, 10))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Map for Kenya DataFrame (excluding m49_code)')
plt.show()

The initial correlation map provides a comprehensive overview of relationships within the dataset, where numerous correlations are observed. To focus our analysis, we will delve into specific connections, narrowing our attention to the columns 'female_personnel,' 'male_personnel,' and 'year' for a more detailed examination.

In [20]:
# Select the rows and columns you want to display
selected_rows = ['female_personnel', 'male_personnel','year']
# include='number' selects every numeric dtype, whether the integers are int32 or int64
numeric_columns = encoded_df.select_dtypes(include='number').columns.difference(['m49_code', 'contribution_id', 'id'])

selected_columns = list(set(selected_rows).union(numeric_columns))
correlation_matrix = encoded_df[selected_columns].corr()

plt.figure(figsize=(15, 3))
sns.heatmap(correlation_matrix.loc[selected_rows, selected_columns], annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Map for Kenya DataFrame (selected rows and columns)')
plt.xticks(rotation=45, ha='right')  
plt.yticks(rotation=0)  
plt.show()

In the next analysis, we filter the correlation map to display only relationships whose correlation coefficient is greater than 0.25 in absolute value. Applying this threshold highlights the stronger correlations and hides the weaker ones, giving a clearer and more focused view of the underlying patterns in the data.

In [21]:
# Select the rows and columns you want to display
selected_rows = ['female_personnel', 'male_personnel', 'year']
# include='number' selects every numeric dtype, whether the integers are int32 or int64
numeric_columns = encoded_df.select_dtypes(include='number').columns.difference(['m49_code', 'contribution_id', 'id'])

selected_columns = list(set(selected_rows).union(numeric_columns))
correlation_matrix = encoded_df[selected_columns].corr()

# Filter the correlation matrix based on the threshold
threshold = 0.25
significant_correlations = correlation_matrix[((correlation_matrix > threshold) | (correlation_matrix < -threshold)) & (correlation_matrix != 1)]

# Remove columns with no significant correlation
significant_columns = significant_correlations.columns[~significant_correlations.isna().all()]

plt.figure(figsize=(15, 3))
sns.heatmap(significant_correlations.loc[selected_rows, significant_columns], annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5, yticklabels=True)
plt.title('Significant Correlation Map for Kenya DataFrame (selected rows and columns)')
plt.xticks(rotation=45, ha='right') 
plt.yticks(rotation=0)  
plt.show()
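The thresholding step can be checked on a small hand-made matrix. The sketch below keeps entries whose absolute value exceeds the threshold and masks the diagonal by position rather than by comparing with 1, which is a variation on the cell above. The matrix values and the column name 'troops' are toy assumptions.

```python
import numpy as np
import pandas as pd

names = ["female_personnel", "troops", "year"]
# Hand-made symmetric correlation matrix for illustration only
corr = pd.DataFrame(
    [[1.00, 0.40, -0.10],
     [0.40, 1.00, -0.36],
     [-0.10, -0.36, 1.00]],
    index=names,
    columns=names,
)

threshold = 0.25
# True everywhere except on the diagonal (self-correlations)
off_diagonal = ~np.eye(len(corr), dtype=bool)
# Keep strong correlations of either sign; everything else becomes NaN
significant = corr.where((corr.abs() > threshold) & off_diagonal)
print(significant)
```

Masking by position keeps a genuine correlation of exactly 1 between two different columns, which a comparison with 1 would hide.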

Correlation-Based Hypotheses for Kenya's Contributions to UN Missions:

  1. Gender Composition in Troops:

    • Positive correlation (0.4) suggests an increase in troop numbers aligns with a higher number of female personnel.
  2. Mission Types (UNMISS):

    • Positive correlation (0.3) indicates that records for missions like UNMISS tend to carry higher numbers of female personnel.
  3. Total Personnel Impact:

    • Strong positive correlation (0.91 with female, 0.88 with male) implies a robust connection between overall personnel deployed by Kenya and both genders.
  4. Troops and Male Personnel:

    • Positive correlation (0.48) suggests that Troops records tend to carry larger numbers of male personnel than records for other personnel types.
  5. Temporal Trends:

    • Negative correlation (-0.36) between year and Troops hints at a potential decrease in troop contributions over time.
    • Positive correlation (0.37) between year and Staff Officer implies a potential rise in contributions in this category.
  6. Gender Percentages:

    • Female_percent and male_percent correlations with year and personnel types indicate nuanced temporal and mission-specific variations in gender percentages.
  7. Mission Types and Male Personnel (UNMIS):

    • Positive correlation (0.26) suggests that records for missions like UNMIS tend to carry higher numbers of male personnel.

These hypotheses provide a preliminary understanding for further validation through statistical tests, visualizations, and comprehensive qualitative analysis. Combining this with domain knowledge will enhance our insights into Kenya's contributions to UN Missions.

In [22]:
dataframe_kenya[['female_personnel','male_personnel','total_personnel']].describe()
Out[22]:
female_personnel male_personnel total_personnel
count 1939.00000 1939.000000 1939.000000
mean 6.72099 41.211965 47.932955
std 22.14494 133.077989 152.839133
min 0.00000 0.000000 0.000000
25% 0.00000 2.000000 3.000000
50% 1.00000 6.000000 7.000000
75% 4.00000 11.000000 14.000000
max 204.00000 829.000000 1027.000000

Summary Statistics for 'female_personnel' and 'male_personnel' (Kenya subset):

  1. Count: Both 'female_personnel' and 'male_personnel' have a count of 1,939, one per Kenyan record, indicating no missing values.

  2. Mean: The mean for 'female_personnel' is approximately 6.72, while 'male_personnel' is around 41.21. On average, a record contains far fewer female personnel than male personnel.

  3. Standard Deviation (std): Both columns have standard deviations (22.14 and 133.08) several times larger than their means, signifying strong variability and a right-skewed distribution.

  4. Minimum (min) and Maximum (max): The range spans from 0 to 204 for female personnel and 0 to 829 for male personnel, suggesting a wide range of values driven by a few large troop deployments.

  5. Percentiles (25%, 50%, 75%): The 75th percentile for female personnel is 4, indicating that 75% of records have 4 or fewer female personnel. For male personnel the 75th percentile is 11.

In summary, these descriptive statistics offer a snapshot of the distribution and central tendency of 'female_personnel' and 'male_personnel'. Further analysis, such as data visualization or hypothesis testing, may be necessary for a deeper understanding of the data distribution and relationships.

In [23]:
dataframe_kenya.describe()
Out[23]:
id contribution_id female_personnel male_personnel m49_code year total_personnel female_percent male_percent
count 1939.000000 1939.000000 1939.00000 1939.000000 1939.0 1939.000000 1939.000000 1932.000000 1932.000000
mean 81922.358948 308980.740072 6.72099 41.211965 404.0 2017.521403 47.932955 21.095880 78.904120
std 41144.041391 185362.987396 22.14494 133.077989 0.0 3.932339 152.839133 21.908605 21.908605
min 825.000000 32180.000000 0.00000 0.000000 404.0 2010.000000 0.000000 0.000000 0.000000
25% 48693.500000 73060.500000 0.00000 2.000000 404.0 2014.000000 3.000000 0.000000 69.230000
50% 84164.000000 425382.000000 1.00000 6.000000 404.0 2018.000000 7.000000 16.670000 83.330000
75% 118324.500000 468362.500000 4.00000 11.000000 404.0 2021.000000 14.000000 30.770000 100.000000
max 146802.000000 505177.000000 204.00000 829.000000 404.0 2023.000000 1027.000000 100.000000 100.000000
In [24]:
female_sum = dataframe_kenya["female_personnel"].sum()
male_sum = dataframe_kenya["male_personnel"].sum()
total_sum = female_sum + male_sum
female_percent = female_sum*100/total_sum
male_percent = male_sum*100/total_sum

print(f"Female Percent: {female_percent:.2f}%, Male Percent: {male_percent:.2f}%")
Female Percent: 14.02%, Male Percent: 85.98%
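The 14.02% computed here is a pooled share: total women divided by total personnel. It differs from the mean of the per-record 'female_percent' column (about 21.10 in the describe() output above) because that mean gives a 4-person record the same weight as a 725-person one. A toy sketch of the difference, using three rows taken from the Kenya preview:

```python
import pandas as pd

# Three records from the Kenya preview: one large contingent and two small ones
toy = pd.DataFrame({
    "female_personnel": [44, 1, 5],
    "male_personnel": [681, 3, 16],
})
totals = toy["female_personnel"] + toy["male_personnel"]

# Pooled share: every person counts once
pooled_share = toy["female_personnel"].sum() * 100 / totals.sum()

# Mean of row shares: every record counts once, regardless of its size
mean_of_row_shares = (toy["female_personnel"] * 100 / totals).mean()

print(f"Pooled: {pooled_share:.2f}%, mean of row shares: {mean_of_row_shares:.2f}%")
```

The pooled figure is the one to compare with headcount-based sources, while the mean of row shares describes the typical record.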

Yearly Impact:¶

In [25]:
kenya_by_year = dataframe_kenya.groupby('year')[['female_personnel','male_personnel']].sum()
kenya_by_year['total_personnel'] = kenya_by_year['female_personnel'] + kenya_by_year['male_personnel']
kenya_by_year['female_percent'] = round(kenya_by_year['female_personnel'] * 100 / kenya_by_year['total_personnel'],2)
kenya_by_year['male_percent'] = round(kenya_by_year['male_personnel'] * 100 / kenya_by_year['total_personnel'],2)

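# Note: size() counts contribution records (one per mission, personnel type and reporting date), not distinct missions
kenya_by_year['number_of_mission'] = dataframe_kenya.groupby('year').size()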
kenya_by_year
Out[25]:
female_personnel male_personnel total_personnel female_percent male_percent number_of_mission
year
2010 717 9807 10524 6.81 93.19 94
2011 822 9345 10167 8.08 91.92 82
2012 826 9296 10122 8.16 91.84 97
2013 1207 8370 9577 12.60 87.40 109
2014 1651 8264 9915 16.65 83.35 126
2015 2046 8723 10769 19.00 81.00 136
2016 2416 10230 12646 19.10 80.90 146
2017 226 1126 1352 16.72 83.28 62
2018 335 1871 2206 15.19 84.81 172
2019 501 1457 1958 25.59 74.41 164
2020 498 1433 1931 25.79 74.21 186
2021 423 2232 2655 15.93 84.07 186
2022 617 3698 4315 14.30 85.70 198
2023 747 4058 4805 15.55 84.45 181
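The preview of dataframe_kenya shows one record per mission, personnel type and reporting date, with reports at month end. The yearly sums above therefore add up monthly snapshots rather than counting people, and 'number_of_mission' counts records. The sketch below, on toy data, shows one way to derive an average monthly deployment and a count of distinct missions per year. The variable names are assumptions, and the monthly cadence is inferred from the preview.

```python
import pandas as pd

# Toy data: two missions reported in two consecutive months
toy = pd.DataFrame({
    "last_reporting_date": ["2010-01-31", "2010-01-31", "2010-02-28", "2010-02-28"],
    "mission_acronym": ["UNMIS", "UNAMID", "UNMIS", "UNAMID"],
    "total_personnel": [725, 80, 725, 81],
})
toy["year"] = toy["last_reporting_date"].str[:4].astype(int)

# Personnel deployed at each reporting date, then averaged within the year
monthly = toy.groupby(["year", "last_reporting_date"])["total_personnel"].sum()
avg_monthly_deployment = monthly.groupby(level="year").mean()

# Distinct missions per year versus the raw number of records
distinct_missions = toy.groupby("year")["mission_acronym"].nunique()
records = toy.groupby("year").size()

print(avg_monthly_deployment, distinct_missions, records, sep="\n")
```

An average per reporting date is also less sensitive to years with missing reports than a yearly sum, which may matter for a year with few records such as 2017.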

Total Personnel from Kenya in UN Missions Over Years¶

In [26]:
import plotly.graph_objects as go

# Line Plot: Total Personnel Over Years
fig1 = go.Figure()

fig1.add_trace(go.Scatter(
    x=kenya_by_year.index,
    y=kenya_by_year['total_personnel'],
    mode='lines+markers',
    name='Total Personnel',
    line=dict(color='blue')
))

fig1.update_layout(
    title='Total Personnel from Kenya in UN Missions Over Years',
    xaxis_title='Year',
    yaxis_title='Total Personnel',
    legend=dict(title='Personnel Type'),
    showlegend=True
)

# Stacked Bar Plot: Gender Distribution Over Years
fig2 = go.Figure()

fig2.add_trace(go.Bar(
    x=kenya_by_year.index,
    y=kenya_by_year['female_personnel'],
    name='Female Personnel',
    marker=dict(color='purple')
))

# No explicit `base` is needed: barmode='stack' places this trace on top of the female bars
fig2.add_trace(go.Bar(
    x=kenya_by_year.index,
    y=kenya_by_year['male_personnel'],
    name='Male Personnel',
    marker=dict(color='orange')
))

fig2.update_layout(
    title='Gender Distribution of Kenya\'s Contributions in UN Missions Over Years',
    xaxis_title='Year',
    yaxis_title='Personnel Count',
    barmode='stack',
    legend=dict(title='Gender'),
    showlegend=True
)

# Bar Plot: Number of Missions Over Years
fig3 = go.Figure()

fig3.add_trace(go.Bar(
    x=kenya_by_year.index,
    y=kenya_by_year['number_of_mission'],
    marker=dict(color='green')
))

fig3.update_layout(
    title='Number of Missions Contributed by Kenya Over Years',
    xaxis_title='Year',
    yaxis_title='Number of Missions',
    showlegend=False
)

fig1.show()
fig2.show()
fig3.show()

Active Engagement:¶

In [27]:
kenya_by_mission = dataframe_kenya.groupby('mission_acronym')[['female_personnel','male_personnel']].sum()

kenya_by_mission['total_personnel'] = kenya_by_mission['female_personnel'] + kenya_by_mission['male_personnel']
kenya_by_mission['female_percent'] = round(kenya_by_mission['female_personnel'] * 100 / kenya_by_mission['total_personnel'],2)
kenya_by_mission['male_percent'] = round(kenya_by_mission['male_personnel'] * 100 / kenya_by_mission['total_personnel'],2)
kenya_by_mission['nb_mission']=dataframe_kenya["mission_acronym"].value_counts()

kenya_by_mission
Out[27]:
female_personnel male_personnel total_personnel female_percent male_percent nb_mission
mission_acronym
MINURCAT 0 37 37 0.00 100.00 11
MINURSO 0 28 28 0.00 100.00 28
MINUSCA 362 1056 1418 25.53 74.47 196
MINUSMA 159 755 914 17.40 82.60 164
MONUC 10 126 136 7.35 92.65 6
MONUSCO 1347 9915 11262 11.96 88.04 337
UNAMID 1778 10359 12137 14.65 85.35 305
UNIFIL 64 174 238 26.89 73.11 127
UNISFA 49 109 158 31.01 68.99 60
UNMHA 13 15 28 46.43 53.57 29
UNMIL 319 1264 1583 20.15 79.85 112
UNMIS 967 13186 14153 6.83 93.17 54
UNMISS 7916 42794 50710 15.61 84.39 401
UNSMIS 0 7 7 0.00 100.00 3
UNSOM 48 20 68 70.59 29.41 41
UNSOS 0 65 65 0.00 100.00 65
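
Some of the highest female shares in the table come from missions with very small totals (for example UNSOM with 68 and UNMHA with 28), where a few individuals move the percentage a lot. One possible way to rank missions more robustly is to require a minimum total first. The sketch below reuses a subset of the counts from the table; the cut-off `min_total = 100` is an arbitrary assumption, not part of the original analysis:

```python
import pandas as pd

# Counts copied from the kenya_by_mission table above (a subset of rows).
missions = pd.DataFrame(
    {
        "female_personnel": [48, 13, 49, 64, 362, 7916],
        "male_personnel": [20, 15, 109, 174, 1056, 42794],
    },
    index=["UNSOM", "UNMHA", "UNISFA", "UNIFIL", "MINUSCA", "UNMISS"],
)
missions["total_personnel"] = missions["female_personnel"] + missions["male_personnel"]
missions["female_percent"] = (
    missions["female_personnel"] * 100 / missions["total_personnel"]
).round(2)

min_total = 100  # assumed cut-off, chosen only for illustration
robust = missions[missions["total_personnel"] >= min_total].sort_values(
    "female_percent", ascending=False
)
print(robust[["total_personnel", "female_percent"]])
```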

Personnel Count and Gender Distribution of Kenya's Contributions by Mission¶

In [28]:
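import plotly.express as px  # needed for px.bar below

# Keep only the raw counts; the percentage and record-count columns are not plotted
unwanted_columns = ['nb_mission', 'female_percent', 'male_percent']
kenya_by_mission_subset = kenya_by_mission.drop(unwanted_columns, axis=1, errors='ignore')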

kenya_by_mission_subset = kenya_by_mission_subset.select_dtypes(include='number')

sorted_data = kenya_by_mission_subset.sort_values(by='total_personnel', ascending=False)

fig = px.bar(sorted_data, x=sorted_data.index, y=['female_personnel', 'male_personnel'],
             labels={'value': 'Personnel Count', 'variable': 'Gender'},
             title="Gender Distribution of Kenya's Personnel by Mission",
             color_discrete_map={'female_personnel': 'purple', 'male_personnel': 'orange'},
             width=1000, height=600)

fig.update_layout(barmode='stack', legend=dict(title='Gender', orientation='h', x=0, y=1.1),
                  xaxis_title='Mission Acronym', yaxis_title='Personnel Count')

fig.show()

Troop Contribution:¶

In [29]:
kenya_by_personnel_type = dataframe_kenya.groupby('personnel_type')[['female_personnel','male_personnel']].sum()

kenya_by_personnel_type['total_personnel'] = kenya_by_personnel_type['female_personnel'] + kenya_by_personnel_type['male_personnel']
kenya_by_personnel_type['female_percent'] = round(kenya_by_personnel_type['female_personnel'] * 100 / kenya_by_personnel_type['total_personnel'],2)
kenya_by_personnel_type['male_percent'] = round(kenya_by_personnel_type['male_personnel'] * 100 / kenya_by_personnel_type['total_personnel'],2)
kenya_by_personnel_type['nb_mission']=dataframe_kenya["personnel_type"].value_counts()

kenya_by_personnel_type
Out[29]:
female_personnel male_personnel total_personnel female_percent male_percent nb_mission
personnel_type
Experts on Mission 731 3441 4172 17.52 82.48 696
Individual Police 1304 3615 4919 26.51 73.49 398
Staff Officer 801 2416 3217 24.90 75.10 406
Troops 10196 70438 80634 12.64 87.36 439

Number of Contribution Records from Kenya by Personnel Type¶

In [30]:
# Data
labels = kenya_by_personnel_type.index
missions_count = kenya_by_personnel_type['nb_mission']

bar_positions = list(range(len(labels)))
fig = go.Figure()

# Plot Number of Missions
fig.add_trace(go.Bar(
    x=bar_positions,
    y=missions_count,
    marker_color='green'
))

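# nb_mission holds value_counts() of personnel_type, i.e. report rows, not distinct missions
fig.update_layout(
    xaxis=dict(tickmode='array', tickvals=bar_positions, ticktext=labels),
    xaxis_title='Personnel Type',
    yaxis_title='Number of Records',
    title='Number of Contribution Records from Kenya by Personnel Type'
)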

for i, mission_count in enumerate(missions_count):
    fig.add_annotation(
        x=bar_positions[i],
        y=mission_count + 1,
        text=str(mission_count),
        showarrow=False,
        font=dict(color='black')
    )
    
fig.show()

Gender Distribution of Kenya's Personnel by Personnel Type¶

In [31]:
import plotly.graph_objects as go

# Data
labels = kenya_by_personnel_type.index
female_percentages = kenya_by_personnel_type['female_percent']
male_percentages = kenya_by_personnel_type['male_percent']

bar_positions = list(range(len(labels)))
fig = go.Figure()

# Plot Female Percentages
fig.add_trace(go.Bar(
    x=bar_positions,
    y=female_percentages,
    name='Female',
    marker_color='purple'
))

# Plot Male Percentages (stacked on top of the female bars via barmode='stack';
# no separate offsetgroup, which could place the bars side by side instead)
fig.add_trace(go.Bar(
    x=bar_positions,
    y=male_percentages,
    name='Male',
    marker_color='orange'
))

fig.update_layout(
    barmode='stack',
    xaxis=dict(tickmode='array', tickvals=bar_positions, ticktext=labels),
    xaxis_title='Personnel Type',
    yaxis_title='Percentage',
    title="Gender Distribution of Kenya's Personnel by Personnel Type"
)

for i, (female_percentage, male_percentage) in enumerate(zip(female_percentages, male_percentages)):
    fig.add_annotation(
        x=bar_positions[i],
        y=female_percentage + male_percentage + 1,
        text=f'{female_percentage + male_percentage:.2f}%',
        showarrow=False,
        font=dict(color='black')
    )

# Show figure
fig.show()

New dataframe with latitude, longitude, GDP, count of the population¶

In [32]:
dataframe_coordinates = pd.read_csv("DPO-UCHISTORICAL_coordinates2.csv",encoding = 'utf-8',sep=';')
In [33]:
dataframe_coordinates.dtypes
dataframe_coordinates["Value of GDP (million)"] = dataframe_coordinates["Value of GDP (million)"].astype(float)
dataframe_coordinates["Value of population (million)"] = dataframe_coordinates["Value of population (million)"].astype(float)
In [34]:
dataframe_coordinates_world = pd.DataFrame()
dataframe_coordinates_world['number_of_contributions'] = dataframe_coordinates.groupby('contributing_country')['contributing_country'].count()
dataframe_coordinates_world["Value of GDP (million)"] = dataframe_coordinates.groupby('contributing_country')["Value of GDP (million)"].mean()
dataframe_coordinates_world["Value of population (million)"] = dataframe_coordinates.groupby('contributing_country')["Value of population (million)"].mean()
dataframe_coordinates_world["Latitude"] = dataframe_coordinates.groupby('contributing_country')["Latitude"].mean()
dataframe_coordinates_world["Longitude"] = dataframe_coordinates.groupby('contributing_country')["Longitude"].mean()

dataframe_coordinates_world['sum_female_personnel'] = dataframe_coordinates.groupby('contributing_country')['female_personnel'].sum()
dataframe_coordinates_world['sum_male_personnel'] = dataframe_coordinates.groupby('contributing_country')['male_personnel'].sum()
dataframe_coordinates_world['total_personnel'] =  dataframe_coordinates_world['sum_female_personnel']+dataframe_coordinates_world['sum_male_personnel']
dataframe_coordinates_world['percent_female_personnel'] = round(dataframe_coordinates_world['sum_female_personnel'] * 100 / dataframe_coordinates_world['total_personnel'],2)
dataframe_coordinates_world['percent_male_personnel'] = round(dataframe_coordinates_world['sum_male_personnel'] * 100 / dataframe_coordinates_world['total_personnel'],2)

dataframe_coordinates_world.isna().sum()

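# Assign the result back instead of calling fillna(inplace=True) on a column selection,
# which is deprecated chained assignment in recent pandas versions
dataframe_coordinates_world["percent_female_personnel"] = dataframe_coordinates_world["percent_female_personnel"].fillna(0)
dataframe_coordinates_world["percent_male_personnel"] = dataframe_coordinates_world["percent_male_personnel"].fillna(0)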

dataframe_coordinates_world.reset_index(inplace=True)
In [35]:
dataframe_coordinates_world.columns
Out[35]:
Index(['contributing_country', 'number_of_contributions',
       'Value of GDP (million)', 'Value of population (million)', 'Latitude',
       'Longitude', 'sum_female_personnel', 'sum_male_personnel',
       'total_personnel', 'percent_female_personnel',
       'percent_male_personnel'],
      dtype='object')
In [36]:
dataframe_coordinates_world
Out[36]:
contributing_country number_of_contributions Value of GDP (million) Value of population (million) Latitude Longitude sum_female_personnel sum_male_personnel total_personnel percent_female_personnel percent_male_personnel
0 Afghanistan 1 14.939 41.13 33.939110 67.709953 0.0 0.0 0.0 0.00 0.00
1 Albania 85 18.260 2.84 41.153332 20.168331 72.0 481.0 553.0 13.02 86.98
2 Algeria 155 163.473 44.90 28.033886 1.659626 0.0 456.0 456.0 0.00 100.00
3 Angola 17 70.533 35.59 -11.202692 17.873887 17.0 17.0 34.0 50.00 50.00
4 Argentina 1290 487.227 45.51 -38.416097 -63.616672 6630.0 81531.0 88161.0 7.52 92.48
... ... ... ... ... ... ... ... ... ... ... ...
148 Vanuatu 78 981.000 0.33 -15.376706 166.959158 47.0 444.0 491.0 9.57 90.43
149 Viet Nam 474 366.138 98.19 14.058324 108.277199 1203.0 7161.0 8364.0 14.38 85.62
150 Yemen 1489 9.947 33.70 15.552727 48.516388 0.0 23853.0 23853.0 0.00 100.00
151 Zambia 2024 21.313 20.02 -13.133897 27.849332 15031.0 100111.0 115142.0 13.05 86.95
152 Zimbabwe 1509 24.118 16.32 -19.015438 29.154857 5398.0 8224.0 13622.0 39.63 60.37

153 rows × 11 columns
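
The country-level table above is assembled column by column from repeated `groupby` calls. The same result can probably be reached with a single named aggregation, which groups only once. A minimal sketch on hypothetical toy data (country names and values are invented; only a few of the notebook's columns are reproduced):

```python
import pandas as pd

# Hypothetical toy input with the same column names as dataframe_coordinates.
toy = pd.DataFrame({
    "contributing_country": ["A", "A", "B"],
    "Latitude": [1.0, 1.0, -5.0],
    "female_personnel": [2, 3, 0],
    "male_personnel": [8, 7, 0],
})

# One pass over the groups, one output column per (source column, function) pair.
world = toy.groupby("contributing_country").agg(
    number_of_contributions=("female_personnel", "size"),
    Latitude=("Latitude", "mean"),
    sum_female_personnel=("female_personnel", "sum"),
    sum_male_personnel=("male_personnel", "sum"),
)
world["total_personnel"] = world["sum_female_personnel"] + world["sum_male_personnel"]

# 0 / 0 yields NaN for countries with no personnel; replace it with 0 as the notebook does.
world["percent_female_personnel"] = (
    world["sum_female_personnel"].mul(100).div(world["total_personnel"]).round(2).fillna(0)
)
world = world.reset_index()
print(world)
```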

In [37]:
fig = px.scatter_geo(
    dataframe_coordinates_world,
    lat="Latitude",  
    lon="Longitude",  
    size="total_personnel",
    hover_name="contributing_country",
    projection="natural earth",
    title="Contributions by Country (personnel)",
    template="plotly",
    color="total_personnel",
    color_continuous_scale="Viridis", 
)

fig.update_layout(
    geo=dict(showland=True),
    margin=dict(l=0, r=0, t=40, b=0),
)

fig.show()
In [38]:
fig = px.scatter_geo(
    dataframe_coordinates_world,
    lat="Latitude", 
    lon="Longitude",  
    size="Value of GDP (million)",
    hover_name="contributing_country",
    projection="natural earth",
    title="Value of GDP (million)",
    template="plotly",
    color="Value of GDP (million)",
    color_continuous_scale="Viridis", 
)

fig.update_layout(
    geo=dict(showland=True),
    margin=dict(l=0, r=0, t=40, b=0),
)

fig.show()
In [39]:
fig = px.scatter_geo(
    dataframe_coordinates_world,
    lat="Latitude",  
    lon="Longitude", 
    size="Value of population (million)",
    hover_name="contributing_country",
    projection="natural earth",
    title="Number of inhabitants",
    template="plotly",
    color="Value of population (million)",
    color_continuous_scale="Viridis",  
)

fig.update_layout(
    geo=dict(showland=True),
    margin=dict(l=0, r=0, t=40, b=0),
)

fig.show()

Conclusion: Kenya’s Impact in UN Peace Missions¶

Kenya's role in UN peace missions stands out as a beacon of commitment and influence across various dimensions.

  1. Active Engagement:

    • Kenya plays a pivotal role in UN missions, showcasing dedication in regions such as Sudan, South Sudan, Darfur, and the Democratic Republic of the Congo. UNMISS alone accounts for 50,710 of the personnel counted in this analysis, followed by UNMIS (14,153), UNAMID (12,137), and MONUSCO (11,262). The leadership positions in key missions, including UNMISS, highlight Kenya's strategic involvement in promoting peace and security in Africa.
  2. Yearly Impact:

    • Kenya's yearly totals were highest from 2010 to 2016, ranging from about 9,600 to 12,600 and peaking at 12,646 in 2016. After a sharp drop to 1,352 in 2017, totals dipped slightly again in 2019 and 2020 but have risen since, reaching 4,805 in 2023 according to the yearly table above. This temporal trend signifies Kenya's enduring commitment to global peacekeeping.
  3. Gender Parity:

    • Women make up 14.02% of Kenya's personnel overall, with higher shares in some roles (26.51% of Individual Police and 24.90% of Staff Officers). Disparities with the UN Gender Parity Dashboard hint at potential underrepresentation of women in unaccounted roles within peacekeeping operations. Further examination and inclusivity initiatives are warranted.
  4. Troop Contribution:

    • Troops make up the bulk of Kenya's contribution (80,634 of 92,942 personnel counted), while Experts on Mission appear in the largest number of records (696), underscoring its commitment to global peacekeeping. Higher female shares among “Individual Police” (26.51%) and “Staff Officer” (24.90%) than among Troops (12.64%) showcase a more inclusive approach in those roles.
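
The yearly trend summarized in item 2 can be checked with a year-over-year percentage change. The sketch below copies the `total_personnel` values from the `kenya_by_year` table; the variable names `totals` and `yoy` are illustrative and not part of the original notebook:

```python
import pandas as pd

# Yearly totals (female + male) copied from the kenya_by_year table.
totals = pd.Series(
    [10524, 10167, 10122, 9577, 9915, 10769, 12646,
     1352, 2206, 1958, 1931, 2655, 4315, 4805],
    index=range(2010, 2024),
    name="total_personnel",
)

# Percentage change relative to the previous year (NaN for the first year).
yoy = totals.pct_change().mul(100).round(1)

print(yoy)
print("Peak year:", totals.idxmax())
print("Years with a decline:", list(yoy[yoy < 0].index))
```

Read this way, the series shows a single large break in 2017 rather than a steady decline, with smaller year-to-year fluctuations before and after it.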

In Closing:

  • Kenya's concentrated efforts in specific regions underline its substantial impact on UN peace missions. The study emphasizes the need for a detailed examination of gender parity discrepancies, recognizing Kenya's indispensable contribution to global peacekeeping efforts.

Annex:

  • A Power BI application has been developed to enhance accessibility and visualization of the project. Explore the dynamic analysis through the Power BI link.

Additional Resources:

  • Population, Surface Area, and Density Data
  • GDP and GDP Per Capita Data